UBC/SFU-Shieh-MC1
Student Team: YES
TABLEAU
MYSQL
Adobe Illustrator (For Mapping)
Video:
Answers to Mini-Challenge 1 Questions:
MC 1.1 Create
a visualization of the health and policy status of the entire Bank of Money
enterprise as of 2 pm BMT (BankWorld Mean Time) on
February 2. What areas of concern do you observe?
The goal was to create a single visualization that fits on one screen, a dashboard, from which someone can easily see and explore the overall health and status of the network. Exploration is supported by highlighting and filtering actions built into the dashboard. Since the vast majority of computers are completely healthy, defined as policy status 1 and activity flag 1, they are filtered out of every view except the lower table. This removes noise from the visualization and lets one focus on anomalies.
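This filter is straightforward to express against the MySQL store that feeds Tableau. The query below is only a sketch; the table and column names (bom_status, ipaddr, healthtime, policystatus, activityflag, and so on) are illustrative stand-ins for the actual schema, and the timestamp literal assumes DATETIME storage:

    -- Machines that are NOT completely healthy at the 2 pm BMT snapshot on February 2.
    -- Table and column names are assumed for illustration; adjust to the actual schema.
    SELECT ipaddr, businessunit, facility, policystatus, activityflag
    FROM bom_status
    WHERE healthtime = '2012-02-02 14:00:00'
      AND NOT (policystatus = 1 AND activityflag = 1);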
Figure 1. Overall
health dashboard at 2pm with no highlighting or filtering.
In Figure 1 we can see that most of the computers with deviated policy statuses are servers, and that they mostly report policy status 2. Overall, the system is largely healthy: roughly 80% of computers report policy status 1 and activity flag 1.
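The ~80% figure can be checked with a simple aggregate over the same snapshot. Again, the schema names are assumed; MySQL evaluates the boolean expression as 1 or 0, so SUM counts the completely healthy machines:

    -- Share of completely healthy machines (policy status 1 and activity flag 1)
    -- at the 2 pm snapshot. Schema names are illustrative.
    SELECT 100.0 * SUM(policystatus = 1 AND activityflag = 1) / COUNT(*) AS pct_healthy
    FROM bom_status
    WHERE healthtime = '2012-02-02 14:00:00';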
Figure 2. Overall
health dashboard at 2pm highlighting policy status 5.
Here we see that only one computer reports policy status 5, a possible virus detected. It is located at Datacenter 2, HQ.
Figure 3. Overall health dashboard at 2pm highlighting Regions 5 and 10.
These two regions stand out from the rest because no computers there report policy status 1. Almost all computers report policy status 2, and some report higher statuses.
MC 1.2 Use
your visualization tools to look at how the network’s status changes over
time. Highlight up to five potential anomalies in the network and provide a
visualization of each. When did each anomaly begin and end? What might be an
explanation of each anomaly?
Anomaly 1
Here we explore the findings from MC 1.1 Figure 3. Looking across the whole Health Time range, we can see that Regions 5 and 10 start with no completely healthy computers reporting, and that this anomaly persists to the end of the reporting period. Compare this to a typical region, using Region 1 as an example: Region 1 starts with almost all computers reporting as completely healthy. Although Regions 5 and 10 start with all computers reporting policy status 2, the rate at which policy statuses 3, 4, and 5 appear is the same as in a typical region.
Figure 4. Policy
status distribution by region. Right chart shows all policy statuses
while the left shows only 4 and 5.
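The charts in Figure 4 are driven by a count of reporting machines per region, policy status, and report time. A minimal sketch of the underlying aggregate, using the same assumed schema (a region column is assumed to be derivable from the business unit):

    -- Machines reporting each policy status, per region and report time.
    -- Schema names are assumed; 'region' may need to be derived from businessunit.
    SELECT region, healthtime, policystatus, COUNT(DISTINCT ipaddr) AS machines
    FROM bom_status
    GROUP BY region, healthtime, policystatus
    ORDER BY region, healthtime, policystatus;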
As seen in MC 1.1 Figure 2, one computer at Datacenter 2 was reporting policy status 5. This is visible right from the start of reporting. Let's keep these findings in mind while looking at Anomaly 2.
Anomaly 2
Here we explore Datacenter 5, which is located in Region 10. Looking at the total number of IP addresses reporting, we see a large spike at 6:00 PM on February 2. Drilling down to the number of IP addresses reporting per time zone and region, we see that Business Unit Headquarters shows an abnormal spike in reporting in Time Zone -6 that is not seen in any other region or time zone, and that this spike occurs only at the Datacenter 5 facility. From the start of the reporting period very few IPs at this facility report at all, with sharp jumps in the number of reporting IPs at 2:30 PM, 6:00 PM, and 7:00 PM. By 7:15 PM the number of IPs reporting matches that of the other Datacenters. We use Datacenter 2 as a comparison; recall that Datacenter 2 was the location of the first detected Policy Status 5. What is interesting is that, after reporting returns to normal, the rates of change of Policy Status mimic those of the other Datacenters. Furthermore, at 7:15 PM the number of reporting IPs reaches nearly the same level as in the other Datacenters. It is as if this Datacenter was experiencing the same deterioration in Policy Status as the other Datacenters but, for some reason, was simply not reporting.
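The reporting-volume comparison behind this observation is a distinct-IP count per report interval, broken out by facility. A hedged sketch using the same assumed schema (the facility labels are illustrative):

    -- Distinct IPs reporting in each interval for the two data centers compared here.
    SELECT healthtime, facility, COUNT(DISTINCT ipaddr) AS ips_reporting
    FROM bom_status
    WHERE facility IN ('Datacenter 2', 'Datacenter 5')
    GROUP BY healthtime, facility
    ORDER BY healthtime, facility;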
Figure 5. A. Number of IP Addresses Reporting per Time Zone and Selected Regions. B. Number of IP Addresses Reporting Overall. C. Policy Status of Data Centers 2 and 5.
Figure 6. Activity
Flag of Data Centers 2 and 5.
We see a similar pattern in the reported Activity Flags: abnormally low numbers reporting at the start, then sharp jumps, then normal reporting that follows the same patterns as the other Datacenters. These findings, coupled with the findings in Anomaly 1, strongly suggest that the system was already infected, at some indeterminate time before reporting started, by malicious software that degrades system health.
Anomaly 3
Having established where the first Policy Status 5 was detected, along with other problematic areas, we now look at how Policy Status 5 propagates through the network.
Figure 7. A. All reported policy statuses over the whole period. B. Map of policy status 5 spread. C. Percent policy status reported by region. D. Percent policy status reported by machine class and function.
In Figure 7A, we can see that the network experiences a gradual deterioration in policy status. The bumps can be attributed to workstations turning off and not reporting in the early mornings. In 7B, we can see that the spread reaches throughout the system; no region is unaffected. The spread does not appear to follow a specific pattern, as can be seen using the Page function in Tableau; this cannot be shown in the snapshot but is demonstrated in the video. In 7C, we see that policy statuses are split in roughly the same proportions across the different regions, with the exception of Regions 5 and 10. Those regions have no Policy Status 1 reports, but their number of Policy Status 2 reports appears to equal the sum of PS=1 and PS=2 in a "normal" region. In 7D, we see that the policy status distributions are proportional across all machine classes and machine functions. These findings strongly suggest that the malicious software affecting the network is not targeting any particular characteristic of the network.
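The proportion views in Figure 7C and 7D reduce to within-group percentages. A sketch of the per-region version, written to run on older MySQL without window functions; swapping the grouping column for machine class and function reproduces the 7D view (schema names remain assumptions):

    -- Percentage of reports at each policy status within each region, whole period.
    SELECT s.region, s.policystatus,
           100.0 * COUNT(*) / t.total AS pct
    FROM bom_status s
    JOIN (SELECT region, COUNT(*) AS total
          FROM bom_status
          GROUP BY region) t ON t.region = s.region
    GROUP BY s.region, s.policystatus, t.total
    ORDER BY s.region, s.policystatus;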
Acknowledgements
Thanks to the
Vancouver Institute of Visual Analytics (VIVA) and the MAGIC lab at UBC for
providing access to tools and lab space.